T-Star (T*): An x86-64 ISA Extension to support thread execution on many cores

Authors

  • Antoni Portero
  • Zhibin Yu
  • Roberto Giorgi
Abstract

The number of cores per chip keeps increasing in order to improve performance while keeping power under control. According to semiconductor roadmaps, future computing systems will reach the scale of 1 Tera devices in a single package, and therefore many-core chips (e.g., 1000 cores or more) will be the norm. Here, we describe an ISE (ISA Extension) that we are experimenting with in the x86-64 ISA in order to provide efficient, fast support for fine-grained threads. The new ISE enables a different execution model based on the availability of data and opens the door to many architectural optimizations that are not possible in current cores. We also describe the architectural support related to the T* extension.
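
As a rough illustration of an execution model "based on the availability of data", the sketch below simulates threads that become runnable only once every input slot of their frame has been written. It is a software toy under stated assumptions, not the T* instruction set: the names `DataflowThread`, `Scheduler`, `write`, and `run` are hypothetical and do not come from the paper, and each input slot is assumed to be written exactly once.

```python
from collections import deque


class DataflowThread:
    """Hypothetical non-blocking thread: runs only after all inputs arrive."""

    def __init__(self, name, num_inputs, body):
        self.name = name
        self.frame = [None] * num_inputs  # per-thread data frame
        self.pending = num_inputs         # inputs still missing
        self.body = body                  # called as body(scheduler, frame)


class Scheduler:
    """Hypothetical scheduler: queues a thread when its last input is written."""

    def __init__(self):
        self.ready = deque()
        self.log = []  # names of threads in the order they executed

    def write(self, thread, slot, value):
        # Assumption: each slot is written exactly once.
        thread.frame[slot] = value
        thread.pending -= 1
        if thread.pending == 0:
            self.ready.append(thread)

    def run(self):
        # Threads never block: they start with a complete frame and run to the end.
        while self.ready:
            thread = self.ready.popleft()
            self.log.append(thread.name)
            thread.body(self, thread.frame)


def demo():
    sched = Scheduler()
    results = []
    # The consumer needs two inputs before it may start.
    consumer = DataflowThread("sum", 2, lambda s, f: results.append(f[0] + f[1]))
    # Each producer needs one input and forwards a value to the consumer.
    producer_a = DataflowThread("a", 1, lambda s, f: s.write(consumer, 0, f[0] * 2))
    producer_b = DataflowThread("b", 1, lambda s, f: s.write(consumer, 1, f[0] + 1))
    sched.write(producer_a, 0, 10)
    sched.write(producer_b, 0, 4)
    sched.run()
    return results, sched.log
```

In this toy, the consumer is never polled or woken speculatively; it is enqueued by whichever producer happens to deliver its last missing input, which is the property the abstract attributes to data-driven execution.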


Similar resources

A Novel Architecture and Simulation for Executing Decoupled Threads in Future 1-kilo-core Chips

T-Star (T*) is an ISA-extension that supports a promising execution model to exploit Thread Level Parallelism (TLP) in designing for the next generation chip. This model relies on DataFlow principles. A compiler partitions the program into non-blocking threads which start consuming their own data frames when all their inputs become ready. Especially for future systems composed of thousands of c...


Serializing instructions in system-intensive workloads: Amdahl's Law strikes again

Serializing instructions (SIs), such as writes to control registers, have many complex dependencies, and are difficult to execute out-of-order (OoO). To avoid unnecessary complexity, processors often serialize the pipeline to maintain sequential semantics for these instructions. We observe frequent SIs across several system-intensive workloads and three ISAs, SPARC V9, X86-64, and PowerPC. As e...


ReFLEX: Block Atomic Execution on Conventional ISA Cores

Modern multicore chips target thread-level parallelism at the expense of increasing instruction-level parallelism from single threaded programs. While recent work has attempted to construct a wide-ILP machine from multiple simple cores, these approaches suffer from ISA overheads or scalability challenges. In this paper, we describe an architecture that is inspired by the scalability and flexibi...


A scalable thread scheduling co-processor based on data-flow principles

Large synchronization and communication overhead will become a major concern in future extreme-scale machines (e.g., HPC systems, supercomputers). These systems will push upwards performance limits by adopting chips equipped with one order of magnitude more cores than today. Alternative execution models can be explored in order to exploit the high parallelism offered by future massive many-core...


Evaluation of a hardware implementation of the SVP concurrency model

SVP is a general concurrency model that has been implemented in the ISA of a multi-threaded core, both of which support dataflow synchronisation with imperative programming. This core is used as a building block to design systems-on-chip comprising many cores, either for general-purpose use or for specific applications. The major advantages of this implementation include asynchrony, i.e. the ab...




Publication date: 2011